Abstract

The common utilization-based definition of available bandwidth and many of the existing tools to estimate it suffer from several
important weaknesses: i) most tools report a point estimate of average available bandwidth over a measurement interval and do not 
provide a confidence interval; ii) the commonly adopted models used to relate the available bandwidth metric to the measured data 
are invalid in almost all practical scenarios; iii) existing tools do not scale well and are not suited to the task of multi-path estimation 
in large-scale networks; iv) almost all tools use ad-hoc techniques to address measurement noise; and v) tools do not provide enough 
flexibility in terms of accuracy, overhead, latency and reliability to adapt to the requirements of various applications. In this paper 
we propose a new definition for available bandwidth and a novel framework that addresses these issues. We define probabilistic 
available bandwidth (PAB) as the largest input rate at which we can send a traffic flow along a path while achieving, with specified 
probability, an output rate that is almost as large as the input rate. PAB is expressed directly in terms of the measurable output rate 
and includes adjustable parameters that allow the user to adapt to different application requirements. Our probabilistic framework
to estimate network-wide probabilistic available bandwidth is based on packet trains, Bayesian inference, factor graphs and active 
sampling. We deploy our tool on the PlanetLab network and our results show that we can obtain accurate estimates with a much 
smaller measurement overhead compared to existing approaches.

Keywords: Bayesian inference, active sampling, belief propagation, network monitoring.

1. Introduction

Recent work has shown that the performance of applications such as overlay network
routing [1, 2] and anomaly detection [3] can be improved significantly when the
network-wide available bandwidth is known. There are many more applications (SLA
compliance, network management, transport protocols, traffic engineering, admission
control) that could also benefit from this information, but existing tools that measure
available bandwidth generally do not meet the requirements of these applications in
terms of accuracy, overhead, timeliness and reliability [4].

The most popular estimation tools are founded on either the probe-gap (PGM) or
probe-rate model (PRM). The PGM assumes a single-hop path with FIFO queuing and fluid
cross-traffic^1. One measurement consists of sampling cross-traffic by observing the
gap between a packet pair at both the input and the output. With every measurement, a
single point estimate of the available bandwidth can be produced as long as i) the
capacity of the tight link is known, ii) there is only one tight link and it is the
same as the narrow link and iii) the end-nodes can transmit faster than the available
bandwidth. PGM-based tools (e.g., Spruce [5], IGI [6]) are lightweight and fast, but
are unable to estimate the available bandwidth of multi-hop paths [7]. The probe-rate
model (PRM) also assumes fluid cross-traffic, but is more robust. The PRM relies on the
principle of self-induced congestion probing [8]: if probes are sent at a rate smaller
than the available bandwidth then the output rate matches the probing rate. However, if
the probing rate is greater than the available bandwidth, packets get queued, which
results in unusual delays and a smaller output rate. Algorithms constructed using the
PRM (e.g., Pathload [9], pathChirp [8]) consist of varying the probing rate to identify
the boundary that separates the two different behaviours described above: an input rate
where probes start experiencing unusual delays. These methods generate more accurate
estimates than PGM-based tools, but they are also more intrusive because they require
multiple iterations at different probing rates.

^1 Traffic is modelled as a continuum of infinitely small packets with an average rate
that changes slowly.
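The self-induced congestion principle can be illustrated with the standard single-hop fluid model. The sketch below is only an illustration under that idealized assumption: the function name and the example capacity and available-bandwidth values are hypothetical and do not come from any of the cited tools.

```python
def fluid_output_rate(input_rate, capacity, available_bw):
    """Output rate of one FIFO hop under the fluid cross-traffic assumption.

    All rates share the same unit (e.g. Mbps). Cross-traffic is taken to be
    capacity - available_bw, which is the utilization-based definition.
    """
    cross_traffic = capacity - available_bw
    if input_rate <= available_bw:
        # No queue builds up: probes leave at the rate they arrived.
        return input_rate
    # The link is saturated: probes receive their proportional share of it.
    return capacity * input_rate / (input_rate + cross_traffic)
```

With a 100 Mbps link and 40 Mbps of available bandwidth, probing at 20 Mbps returns 20 Mbps, whereas probing at 80 Mbps returns roughly 57 Mbps; the boundary between the two regimes is what PRM-based tools search for.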

In addition to the lack of flexibility, existing models and tools
suffer from four other major weaknesses:

1. The vast majority report a single value representing average available bandwidth and
the usefulness of this single value is questionable. Available bandwidth is typically
defined as the capacity of a path unused by cross-traffic over a specified time period.
Most tools produce a single point estimate of the available bandwidth by making
multiple measurements using probes sent throughout the time period of interest. The
cross-traffic often fluctuates significantly over the time period, so probes experience
very different network conditions; an estimate formed from such data can be a
high-variance quantity, making a confidence interval very valuable. Service (or
response) curves are more informative than single average estimates; they present the
statistical mean (asymptotic average) of the output rate for an entire range of input
rates [10]. However, each point on the curve is still an average that does not really
provide a meaningful reflection of the burstiness of the traffic and the variability of
the available bandwidth metric. A more robust and practically relevant manner to
express the available bandwidth is the variation range (confidence interval) proposed
by Jain and Dovrolis [9].

2. The observation model relating measured data to the 
utilization-based definition of available bandwidth is inac- 
curate and biased in most practical situations. As a result, 
the value provided by most tools does not genuinely reflect 
the quantity the tools claim to estimate. The fluid cross- 
traffic assumption underpins the vast majority of models 
used for inference. Liu et al. [10] show that the assumed 
relationships between the measured quantities (packet dis- 
persion, one-way delay, output rate) and the estimated 
value (utilization, unused capacity) are not sound; even 
for simple, slightly more realistic scenarios, the adoption 
of a fluid model leads to significant underestimates of the 
available bandwidth (unused capacity). 

3. The mechanisms used by most tools to handle measure- 
ment noise are ad-hoc and, in many cases, inadequate. 
Measurement errors and noise generated by the end-hosts 
and routers along the end-to-end path are unavoidable in 
practice. Common issues include route changes, out-of- 
order packet delivery, packet replications, errors in the 
probing packets due to link quality issues, incorrect packet 
time stamps, and poor Network Interface Card utilizations. 
Although measures can be adopted to prevent some of 
these errors, it is impossible to eradicate them all. It is 
important that the model and inference technique are ro- 
bust, and that they can tolerate and handle noisy measure- 
ments. One example of a technique that does handle noise 
more robustly is Traceband [11], which employs a hidden
Markov model that allows the technique to statistically ad- 
just to noise in the measurements. 

4. Current tools cannot be applied to larger networks to si- 
multaneously estimate the available bandwidths of multi- 
ple paths. Using existing tools, probing all paths concur- 
rently not only introduces an unacceptable overhead and 
overloads hosts, but also leads to significant underestima- 
tion due to interference between the probes on links shared 
by multiple paths [12]. The alternative to simultaneous
measurements is to sequentially probe each path indepen- 
dently. This is unacceptably time-consuming and very in- 
efficient, however, because it ignores the significant corre- 
lations that arise in available bandwidth metrics when the 
network paths share links. 

In this paper, we tackle the problem of network-wide (multi- 
path) available bandwidth estimation. In developing our ap- 
proach, we strive to address the issues we have identified above. 
This problem can be related to large-scale network inference. There are similarities
with network tomography [13], which consists of estimating either i) link-level
parameters based on end-to-end measurements; or ii) path-level traffic intensity based
on link-level traffic measurements [14]. There are two key differences. First,
tomography involves a mapping from path-level measurements to link-level metrics or
vice versa; in the network-wide available bandwidth problem we are interested in
estimating path-level metrics from path-level measurements. Second, in most network
tomography problems, there is a linear relation of the form $y = Ax$ between
measurements $y$ and network parameters $x$, where $A$ is a routing matrix. In our
problem, this relationship is non-linear; one of our modelling assumptions is that the
available bandwidth of a path is the minimum of the available bandwidths of all its
constituent links.
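The difference between the linear tomography model and the minimum relation can be made concrete with a small sketch. The routing matrix and the link values below are hypothetical and chosen only for illustration; rows are paths and columns are links.

```python
# Hypothetical routing matrix: rows are paths, columns are links
# (an entry of 1 means the link lies on the path).
ROUTING = [
    [1, 1, 0],
    [0, 1, 1],
]


def additive_path_metric(routing, link_values):
    """Linear model y = A x, e.g. per-link delays adding up along a path."""
    return [sum(a * x for a, x in zip(row, link_values)) for row in routing]


def bottleneck_path_metric(routing, link_values):
    """Non-linear model: the path value is the minimum over its links."""
    return [min(x for a, x in zip(row, link_values) if a) for row in routing]
```

For link delays of 5, 2 and 7 the additive model gives path delays of 7 and 9, whereas for link available bandwidths of 50, 20 and 70 both paths are limited to 20 by the shared link, a mapping that no routing-matrix product can express.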

The task is more closely related to the problem of network kriging [15], which involves
estimating (functions of) path-level metrics throughout a network using end-to-end path
measurements. This problem was also addressed in [16, 17], where an algebraic approach
was proposed for exactly recovering, under the assumption of no noise, the path-level
metrics of all the end-to-end paths in a network by monitoring only a small subset of
the paths. The method in [15] reduced this monitoring cost even further, at the expense
of introducing a small error in the estimated metrics. For real-time applications,
estimates must not only be produced with minimal overhead, but also in a timely manner.
To meet these requirements, measurements, even for a reduced subset of paths, must be
scheduled at the same time. To avoid simultaneous probes interfering with each other
and overloading nodes, Song and Yalagandula [18] propose a resource-aware technique
that achieves better accuracy than resource-oblivious methods at the cost of using more
measurement data. All of these approaches, as well as the wavelet-based methodology
described in [19], are only appropriate for (approximately) additive metrics, such as
loss or delay, where a linear relationship can be constructed between the link-level
and path-level metrics. However, Song and Yalagandula [18] suggest that their approach
could be extended to available bandwidth estimation by selecting paths such that the
load of their probes only represents a small fraction of the capacity of each link.

Large-scale (multi-path) estimation of available bandwidth 
has not received as much attention as other metrics. To limit 
measurement overhead, BRoute [20] capitalizes on the spa- 
tial correlation between links shared by many paths and the 
observation that 86% of Internet bottleneck links are within 
four hops (end-segments) from end nodes [21]. The tool first 
uses traceroute landmarks to identify AS-level end segments for 
each node, and then measures available bandwidth on these seg- 
ments by using landmarks with high downstream bandwidth. 
Maniymaran and Maheswaran [22] propose a more efficient landmark-based approach that is
similar to BRoute but has reduced storage and inference complexity. Another approach to
large-scale available bandwidth estimation is to exploit the correlation between
various metrics (route, number of hops, capacity and available bandwidth); since the
measurement cost for each metric is different, monitoring those that are cheaper to
measure can reduce the load on the network [23]. To further reduce the amount of
probing overhead, Man et al. [24] propose to reshape existing TCP traffic to look like
packet pairs, trains or
chirps so that no extra traffic is injected in the network. Despite 
these efforts to minimize the overhead of the estimation pro- 
cedure, most of these network-wide tools do not address any 
of the concerns mentioned earlier; they are neither flexible nor 
robust to noisy measurements, they produce a single average 
value for each path and they are based on an invalid mapping 
between measurements and the inferred metrics. 

1.1. Contributions 

We present a novel system that addresses the five weaknesses
discussed above. Our solution includes i) a probabilistic rate-based
definition for the available bandwidth and ii) a network-wide
estimation tool.

Our implementation uses the Bayesian inference framework, factor graphs and the belief
propagation algorithm to fuse the information obtained from all measurements. We adopt
a model that relates the PAB of each path to the PAB of its constituent links; the
factor graph provides a mechanism for capturing this model and enables computationally
efficient inference. These techniques have been successfully used in large-scale
network problems, such as link loss inference applications [25, 26] and the computation
of conditional entropies for both fault diagnosis and most informative test selection
[27-29], but not yet in the context of available bandwidth estimation.

Another novel contribution is our algorithm to determine 
which path and rate to probe at each iteration; a process that 
can be related to sequential Bayesian sampling [30] and
active/adaptive sampling [31]. This sampling strategy consists of
selecting the next measurement(s) based on the information ac- 
quired previously, such that the expected information gain is 
maximized. In networking, it has been used in the context of 
network tomography to determine the measurements that pro- 
vide the best information gain about the network path property 
given their probing overhead [32], but has yet to be applied to 
available bandwidth estimation. 
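As an illustration of selecting a measurement by expected information gain, the following sketch scores candidate probing rates for a single path. It uses a generic expected-entropy-reduction criterion, which is not necessarily the selection rule used in this paper; the logistic likelihood, its `slope` value and all function names are assumptions made for the example.

```python
import math


def entropy(pmf):
    """Shannon entropy, in bits, of a probability mass function."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)


def likelihood_success(rate, pab, slope=0.5):
    """Assumed Pr(z = 1 | PAB, rate): near 1 below the PAB, near 0 above."""
    return 1.0 / (1.0 + math.exp(slope * (rate - pab)))


def expected_information_gain(rate, grid, prior, slope=0.5):
    """Expected drop in entropy of the PAB belief after probing at `rate`."""
    like = [likelihood_success(rate, y, slope) for y in grid]
    p_success = sum(l * p for l, p in zip(like, prior))
    gain = entropy(prior)
    outcomes = ((p_success, like), (1.0 - p_success, [1.0 - l for l in like]))
    for p_z, weights in outcomes:
        if p_z > 0.0:
            # Bayes update of the belief for this hypothetical outcome.
            posterior = [w * p / p_z for w, p in zip(weights, prior)]
            gain -= p_z * entropy(posterior)
    return gain


def choose_rate(grid, prior, slope=0.5):
    """Candidate rate with the largest expected information gain."""
    return max(grid, key=lambda r: expected_information_gain(r, grid, prior, slope))
```

With a uniform belief over rates 1 to 100, `choose_rate` selects a rate near the middle of the range, since a probe there splits the remaining probability mass most evenly.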

The rest of this paper is organized as follows. In Sect. 2 we introduce a new metric,
probabilistic available bandwidth, and formally state the estimation problem. In Sect.
3 we detail our novel probabilistic framework, which is the first to combine factor
graphs and active sampling to estimate available bandwidth. In Sect. 4 we present
results from our simulations and online experiments on the PlanetLab network. In Sect.
5 we summarize our contributions and discuss future work.



2. Probabilistic Available Bandwidth

We specify the probabilistic available bandwidth (PAB) metric directly in terms of
input rates and output rates of traffic on a path. We are interested in determining the
largest input rate $r_p$ at which we can send a traffic flow along a path while
achieving an output rate $r'_p$ that is almost (within $\epsilon$) as large as the
input rate, with specified probability^2 at least $\gamma$. More formally, for given
$\epsilon > 0$ and $\gamma > 0$, we seek the largest input rate such that
$\Pr(r'_p > r_p - \epsilon) \geq \gamma$. We denote the largest such ingress rate by
$y_p$ and refer to it as the probabilistic available bandwidth for path $p$:

$$y_p = \max \{ r_p : \Pr(r'_p > r_p - \epsilon) \geq \gamma \}.$$

The probabilistic available bandwidth is located at the boundary of two regions with
different behaviours (i.e., where we can expect different outputs). For smaller rates,
$r_p < y_p$, there is a probability greater than or equal to $\gamma$ that the output
rate will be within a margin of $\epsilon$ of the input rate. For input rates greater
than the PAB, $r_p > y_p$, this probability is not guaranteed.

We believe that this new definition for available bandwidth is more robust and
practical for several important reasons. First, it provides a more valid mapping
between the measured and inferred quantities. By expressing available bandwidth
directly in terms of the input and output rates, there is no longer a need to bridge
the gap between packet dispersion and unused capacity through generally invalid
modelling assumptions. Second, the probabilistic framework gives flexibility to the
user and is more resistant to variability (cross-traffic burstiness) and noise (errors)
in the measurements. The values of the two parameters $\epsilon$ and $\gamma$ are
defined by the user based on application requirements and the network environment. For
example, increasing the value of $\gamma$ results in a more conservative (smaller)
estimate of the probabilistic available bandwidth. In a network where frequent
measurement errors occur, the value of $\epsilon$ can be increased, if the application
can tolerate a certain reduction in output rate. Last, it represents a more practical
and concrete quantity: the probability that transmitting data at a given rate will
yield the desired (same) output rate.

2.1. Problem Statement

We focus on the problem of network-wide available bandwidth estimation, but in terms of
our newly introduced metric, probabilistic available bandwidth. More formally, for a
specified $(\epsilon, \gamma)$ and a network that consists of a set $\mathcal{L}$ of
$N$ links and a set $\mathcal{P}$ of $M$ paths, we wish to form estimates of the
probabilistic available bandwidths of all paths in the network. Let the PAB of each
path $p$ be modelled as a discrete^3 random variable $y_p$; e.g., $\Pr(y_p = r)$ is the
probability that the PAB on path $p$ is $r$.

We use an iterative probing strategy where, for each measurement, we wish to determine
if the probing rate is greater or smaller than the probabilistic available bandwidth.
At each iteration $k$, we evaluate a binary outcome^4 $z_k$ that specifies whether the
egress rate was within $\epsilon$ of the ingress rate. Then at any given instant $k$,
we are interested in the marginal posterior $\Pr(y_p \mid \mathbf{z})$ for every path
$p$, where $\mathbf{z} = [z_1, \ldots, z_k]$. Our goal is to identify a probing method
and the most informative measurement at each iteration in order to form the PAB
estimates, such that the credible intervals of the estimates (based on the marginal
posteriors) are acceptably tight and the measurement overhead is minimal.

^2 The probability is defined over all possible multi-packet flows of average rate
equal to the input rate that can complete transmission during the specified measurement
period.

^3 We chose to define $y_p$ as a discrete, rather than continuous, random variable
because it is not meaningful to have an infinite precision on the transmission rates.

^4 Despite the loss of information, we choose to produce a binary outcome rather than
use the output rate directly for two reasons. First, a binary outcome is more robust
and less sensitive to noisy measurements. Second, there is no available likelihood
model for the output rate and it is easier to construct empirically an accurate one for
the binary outcome.

Figure 1: Graphic representation of the probabilistic available bandwidth. The
probability that $y_p$ lies in the confidence range $[\beta_{\min}, \beta_{\max}]$ of
size $\beta_p$ is equal to $\eta$ (confidence level).

Rather than estimating the PAB by a single value, we identify a confidence interval
likely to include it. For a given distribution, such as the one depicted in Fig. 1, the
confidence interval of size $\beta_p$ with confidence limits
$[\beta_{\min}, \beta_{\max}]$ is the smallest interval that has a confidence level
(fraction of probability mass) greater than $\eta$. The estimation procedure terminates
when the size of the confidence interval of each path is smaller than $\beta$
($\forall p : \beta_p < \beta$). For the cases when the variability of the measurements
is too high to meet the desired tightness for confidence intervals, the procedure also
stops when the maximum number of iterations is reached.

The value of $\eta$ and the desired size of the confidence interval $\beta$ (how tight
the interval is) are both defined by the user depending on application requirements and
determine how accurate, fast and intrusive the estimation tool is. For example, a
larger $\beta$ or smaller $\eta$ will generally require a smaller number of
measurements, which leads to a faster estimation with a smaller overhead, but also a
less accurate one. It is important to understand the distinction between $\gamma$ and
$\eta$. The confidence level $\eta$ for a path is the probability that $y_p$ lies in
the confidence interval of size $\beta_p$ bounded between $\beta_{\min}$ and
$\beta_{\max}$. The probability of success $\gamma$ represents the probability, for
rates smaller than the probabilistic available bandwidth $y_p$, that the output rate is
within a margin of $\epsilon$ of the input rate.

3. Methodology

Our main challenge is to develop a technique to estimate probabilistic available
bandwidth that is efficient and scales well with the number of paths. We can divide
this problem into the following three tasks: i) measure a path and produce a binary
outcome, ii) compute the marginal of the path's probabilistic available bandwidth from
measurement outcomes and establish confidence intervals for the PAB, and iii) identify
measurements (choose the path and probing rate) at each iteration that will minimize
the overhead on the network. A general overview of our approach is presented in Fig. 2.
We will explain each line (except for the termination criteria of lines 2 and 7,
presented in Sect. 2.1) in the rest of this section.

 1  create factor graph using known topology;
 2  while $\exists p$ s.t. $\beta_p > \beta$ do
 3      choose path to probe next;
 4      choose rate to probe;
 5      take new measurement;
 6      run belief propagation (update marginal posteriors $\Pr(y_p \mid \mathbf{z})$);
 7      if maximum number of probes is reached then
 8          break;
 9      end
10  end

Figure 2: Multipath probabilistic available bandwidth estimation algorithm.
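The termination test on line 2 depends on the size of the smallest interval that holds more than the required fraction of posterior mass. The sketch below is one way to compute it, assuming the marginal posterior of a path is stored as a probability mass function over a sorted grid of candidate rates; this representation and the names `smallest_interval`, `needs_more_probes`, `eta` and `beta` are choices made for the example, not the authors' implementation.

```python
def smallest_interval(grid, pmf, eta):
    """Narrowest (low, high) span of grid points whose mass exceeds eta.

    grid must be sorted in increasing order and pmf[i] is the probability
    of grid[i]. Returns None if no span holds more than eta.
    """
    best = None
    for start in range(len(grid)):
        mass = 0.0
        for end in range(start, len(grid)):
            mass += pmf[end]
            if mass > eta:
                # Shortest span starting here; keep it if it beats the best.
                if best is None or grid[end] - grid[start] < best[1] - best[0]:
                    best = (grid[start], grid[end])
                break
    return best


def needs_more_probes(grid, posteriors, eta, beta):
    """True while at least one path has an interval wider than beta."""
    for pmf in posteriors:
        interval = smallest_interval(grid, pmf, eta)
        if interval is None or interval[1] - interval[0] > beta:
            return True
    return False
```

The quadratic scan is adequate for the small discrete grids considered here; a two-pointer sweep would give the same result in linear time.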

Our method is based on four assumptions. 

1. At the start of each link is a store-and-forward first-come first-served
router/switch that dictates the behaviour of the link (in terms of delay, loss,
utilization). If the network uses priority queueing or some other form of router-level
Quality-of-Service provisioning, then our method will infer the probabilistic available
bandwidth as seen by the class of packets transmitted as probes.

2. The routing topology of this network is known, as embodied in the set of paths
$\mathcal{P}$, and remains fixed for the duration of our experiments. More precisely,
we construct an $M \times N$ binary path matrix $\mathbf{P}$, where $\mathbf{P}(i, j)$
is equal to one if link $j$ is on path $i$. To populate the matrix, we infer links and
the mapping from IP addresses to routers using traceroute^5.

3. There is a unique path between each pair of hosts involved in probing. If there is
per-packet load balancing in the network, our traceroute-based procedure will identify
only one of the paths traversed by packets. This error takes the form of missing
correlations in the factor graph and could result in inaccurate estimates and/or slower
convergence. Our method is unaffected by destination-based load balancing.

4. Like the majority of utilization-based available bandwidth estimation tools, we
assume that there is a single link (tight link) on each path that essentially
determines the probabilistic available bandwidth of that path. More formally, each path
consists of the set of links $L_p = \{\ell_1, \ell_2, \ldots, \ell_n\}$ and a single
tight link $\ell^* \in L_p$.^6 This allows us to i) perform efficient inference using
path-level data and ii) use logical topologies (combine all links that are in series)
rather than routing topologies to reduce the number of links and the complexity of the
factor graph. Jain and Dovrolis [9] show that multiple tight links can lead to an
underestimation of the available bandwidth. In our case, we interpret the presence of
more than one tight link as a modelling inaccuracy that creates noise propagated in the
factor graph during the execution of the belief propagation algorithm.

^5 traceroute-like methods have been known to inflate the number of observed routers,
record incorrect links and bias router degree distributions [33]. However, traceroute
provides sufficiently accurate topology estimates for us to assess the performance of
our algorithms.

^6 We derive this relationship more formally in Sec. 3.2.2.

We revisit these assumptions in Sec. 4.2 and study how errors or changes in routing
topology affect the performance of our algorithm.

3.1. Probing Strategy 

Our probing strategy (line 5 in Fig. 2) is based on the principle of self-induced
congestion [8]. A single measurement consists of sending $N_t$ trains of $L_t$ UDP
packets of $P_{size}$ bytes at a constant rate $r_p$ and observing the rate $r'_p$ at
the receiver side. We then take the median of $r'_p$ obtained from each of the $N_t$
trains and determine the binary outcome $z$ of the measurement using the following
relation: $z = \mathbb{1}\{r'_p > r_p - \epsilon\}$, where $\mathbb{1}\{x\}$ is the
indicator function (equal to one if $x$ is true and zero otherwise).

To achieve a given input rate $r_p$, we fix the packet size and calculate the time
interval, $\tau$, between the departure of consecutive packets according to the
following relation: $r_p = P_{size}/\tau$. The receiving rate is calculated similarly
by dividing the total number of bytes received by the amount of time that elapsed
between the reception of the first and last packet. However, due to task interruption
on the sender side there can be unusual delays between the departure of two consecutive
packets ($t_i > t_{i-1} + \tau$ where $t_i$ is the departure time of packet $i$). We
consider these packets invalid and exclude them before calculating the output rate.
Upon reception of the last packet of a train, we construct a set $V$ of all the indices
$i > 1$ of valid packets and calculate $r'_p$ as follows:
$r'_p = (|V| \cdot P_{size}) / \sum_{i \in V} (t'_i - t'_{i-1})$, where $t'_i$ is the
reception time of packet $i$.
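The measurement procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: time stamps are in seconds, rates in bytes per second, the output rate is computed from reception-time gaps of valid packets, and the small `tolerance` on the departure spacing is an addition made here to absorb floating-point rounding. The function names are hypothetical.

```python
from statistics import median


def train_output_rate(send_times, recv_times, packet_size, input_rate,
                      tolerance=1e-9):
    """Output rate of one packet train, ignoring packets that departed late.

    send_times and recv_times hold per-packet time stamps in send order.
    """
    tau = packet_size / input_rate  # nominal gap between departures
    valid = [
        i for i in range(1, len(send_times))
        if send_times[i] - send_times[i - 1] <= tau + tolerance
    ]
    if not valid:
        raise ValueError("no valid packets in this train")
    elapsed = sum(recv_times[i] - recv_times[i - 1] for i in valid)
    return len(valid) * packet_size / elapsed


def binary_outcome(train_rates, input_rate, epsilon):
    """z = 1 if the median output rate is within epsilon of the input rate."""
    return 1 if median(train_rates) > input_rate - epsilon else 0
```

Taking the median over trains means that a single train distorted by cross-traffic bursts or sender interruptions does not by itself flip the outcome.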

The probing rate is selected at every iteration, but the other 
parameters are pre-determined before the beginning of the esti- 
mation procedure. The choice of these values is made to mini- 
mize the overhead while making sure that results are accurate. 
In active sampling techniques, the outcome of each measurement plays a role in
determining what rate to probe next. Although using multiple trains ($N_t > 1$) and
taking the median of the output rates increases the overhead on the network, it is also
a way to mitigate the impact of a noisy measurement sequence (e.g., a packet train with
many invalid packets). A similar logic applies when choosing the size of each probe,
$P_{size}$, and the number of probes in a train, $L_t$. Larger probes and longer trains
provide more samples over which to average $r'_p$ but also lead to a more significant
load on the network and a longer sampling period. In Sec. 4 we specify and justify our
choices for each of these parameters.

3.2. Bayesian Inference and Factor Graphs 

Bayesian inference is a classical way to update the knowledge about unknown parameters
based on new observations. In this framework, the posterior distribution
$\Pr(y_p \mid z_k)$ is proportional to the product of the conditional probability
$\Pr(z_k \mid y_p)$, also called the likelihood function, and the prior probability
$\Pr(y_p)$: $\Pr(y_p \mid z_k) \propto \Pr(z_k \mid y_p) \Pr(y_p)$. We are interested
in the marginal, for every path, of the joint posterior distribution of all paths
$\Pr(y_1, \ldots, y_M \mid \mathbf{z})$. The joint probability distribution is complex
but it is factorizable and can therefore be captured with a factor graph (line 1 in
Fig. 2): a graphical model "that indicates how a joint function of many variables
factors into a product of functions of smaller sets of variables" [34]. Factor graphs
are composed of two types of nodes (variable and factor nodes) and edges that show
dependencies between the variables and the factors. In our case, the variables are
discrete random variables of the probabilistic available bandwidth of each link,
$x_\ell$, and path, $y_p$. There are three functions that are represented by factor
nodes in the graph: i) the prior knowledge about the links, $f_x$, ii) the relation
between the PAB of links and paths, $f_{x,y}$, and iii) the likelihood of an
observation on a given path, $f_{y,z}$.

The marginal posteriors are computed (line 6 in Fig. 2) by running belief propagation
on the factor graph [35]. The algorithm starts with each one of the leaf nodes (prior
and likelihood) sending a message to its adjacent node. Messages are then computed
using the sum-product algorithm and continue to propagate until the algorithm
stabilizes, i.e., there is minimal or no variation between a newly computed message and
the one previously sent on the same edge^7. Upon completion it is possible to compute
the marginal at a variable node (links and paths) by taking the product of all messages
incoming on its edges.

Example: In Figure 3 we show an example of a simple logical topology of a network. In
this example, there are four nodes interconnected using $N = 3$ different links labeled
$\ell_1, \ell_2, \ell_3$ and we consider $M = 2$ paths (dashed line: $p_1$, solid line:
$p_2$) where nodes 1 and 2 are the sources and node 4 is the destination.
From the logical topology, we can populate the path matrix P 
and use it to construct the factor graph. 

110" 
1 1 

In Figure 4 we show the factor graph representation of the joint distribution used to
compute marginal posteriors of the PAB of each of the three links and two paths. The
edges show the variables that the factors depend on. In this case, the prior function
is identical for all links, so each variable node $x_\ell$ is connected to a factor
node $f_x$ in the graph. However, we could easily use different functions for each
link. Each path and its underlying set of links $L_p$ are connected together to a
factor node $f_{x,y}$ (there is an edge for every $\mathbf{P}(i, j) = 1$ in the path
matrix). Finally, we see that this specific factor graph includes information from a
single observation that was performed on path $p_1$. For each additional measurement, a
new factor node $f_{y,z}$ is added to the factor graph.
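The construction just described can be sketched as a small routine that lists the factor nodes and the variables each one touches. The path matrix used in the usage note is a hypothetical one (rows are paths, columns are links) and is not claimed to be the matrix of the example above; the naming scheme and the `(path, rate, outcome)` measurement tuples are likewise assumptions of this sketch.

```python
def build_factor_graph(path_matrix, measurements):
    """List the factor nodes of the graph as (name, [variables]) pairs.

    path_matrix[i][j] is 1 if link j+1 lies on path i+1.
    measurements is a list of (path, rate, outcome) tuples, paths 1-based.
    """
    factors = []
    # One prior factor f_x per link variable x_j.
    for j in range(len(path_matrix[0])):
        factors.append((f"f_x[{j + 1}]", [f"x{j + 1}"]))
    # One link-path factor f_xy per path, tied to the path and its links.
    for i, row in enumerate(path_matrix, start=1):
        links = [f"x{j + 1}" for j, on_path in enumerate(row) if on_path]
        factors.append((f"f_xy[{i}]", [f"y{i}"] + links))
    # One likelihood factor f_yz per measurement, tied to the probed path.
    # The rate and outcome would parameterize the factor; only the
    # structure of the graph is built here.
    for k, (path, _rate, _outcome) in enumerate(measurements, start=1):
        factors.append((f"f_yz[{k}]", [f"y{path}"]))
    return factors
```

For a path matrix `[[1, 1, 0], [0, 1, 1]]` and one measurement on path 1, the routine returns three prior factors, two link-path factors and one likelihood factor attached to `y1`.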



^7 Belief propagation will converge in cyclic factor graphs under certain conditions,
but is not guaranteed to do so [36]. Through our extensive simulations, we did not
encounter any convergence issues. To ensure completion, we set the maximum number of
messages between two nodes to five during one run of the belief propagation algorithm.




Figure 3: Logical topology of a 4-node network with $N = 3$ links
($\ell_1, \ell_2, \ell_3$) and $M = 2$ paths: $p_1$ (dashed) and $p_2$ (solid).

Figure 4: Factor graph representation used to estimate the PAB of the two paths
in the topology depicted in Fig. 3.



3.2.1. Prior function 

The first function to define is the prior $f_x$. We use a non-informative prior model
for the PAB of each link: a uniform distribution in the range $[B_{\min}, B_{\max}]$,

$$f_x \sim \mathcal{U}[B_{\min}, B_{\max}],$$

where $B_{\min}$ and $B_{\max}$ are conservative estimates of the minimum and maximum
probabilistic available bandwidths of links. Our choice is due to the lack of any prior
information about the PAB of links or paths.

3.2.2. Relation between links and paths 

Our inference procedure relies on a relationship between the 
PAB of a path and the PABs of its constituent links. For the 
classical utilization-based definition of available bandwidth, it 
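is often assumed that there is a single link on each path that
determines that path's available bandwidth. We develop a similar
relationship for the probabilistic available bandwidth.

For a path $p$ consisting of the set of links $L_p = \{1, 2, \dots, n\}$,
it is possible to identify small constants
$0 < \epsilon_\ell < \sum_{\ell \in L_p} \epsilon_\ell < \epsilon$ and
$0 < \delta_\ell < \sum_{\ell \in L_p} \delta_\ell < 1 - \gamma$ such that:

$$\Pr(r'_\ell < r_\ell - \epsilon_\ell) < \delta_\ell \quad \text{for all } r_\ell < y_p(\epsilon, \gamma), \tag{1}$$

but

$$\Pr(r'_\ell < r_\ell - \epsilon_\ell) > \delta_\ell \quad \text{for all } r_\ell > y_p(\epsilon, \gamma). \tag{2}$$

We can apply the union bound on the links to establish:

$$\Pr\Big[\bigcup_{\ell \in L_p} \{r'_\ell < r_\ell - \epsilon_\ell\}\Big] < \sum_{\ell \in L_p} \delta_\ell. \tag{3}$$

The complement of the event in this union bound is that the condition
$r'_\ell > r_\ell - \epsilon_\ell$ holds for each link. Then we have the following
relationship between the path and link input and output rates:

$$\begin{aligned}
r_1 &= r_p, \\
r_2 &= r'_1 > r_p - \epsilon_1, \\
r_3 &= r'_2 > r_p - \epsilon_1 - \epsilon_2, \\
&\;\;\vdots \\
r'_p &= r'_n > r_p - \sum_{\ell \in L_p} \epsilon_\ell.
\end{aligned}$$

This relationship and the union bound in (3) imply the following:

$$\Pr\Big[r'_p > r_p - \sum_{\ell \in L_p} \epsilon_\ell\Big] > 1 - \sum_{\ell \in L_p} \delta_\ell. \tag{4}$$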



Moreover, we assume that there is a tight link $\ell^* \in L_p$ which
essentially determines the probabilistic available bandwidth on
the path $p$. This means that it is possible, for all $\ell \in L_p$, $\ell \neq \ell^*$,
to identify $\epsilon_\ell \ll \epsilon$ and $\delta_\ell \ll 1 - \gamma$ that satisfy (1). In the case
of $\ell^*$, however, the smallest $\epsilon_{\ell^*} < \epsilon$ and $\delta_{\ell^*} < 1 - \gamma$ pair that
satisfies (1) has the property $\epsilon_{\ell^*} \approx \epsilon$ and $\delta_{\ell^*} \approx 1 - \gamma$. The tight
link assumption implies that $\sum_{\ell \in L_p} \epsilon_\ell \approx \epsilon_{\ell^*} \approx \epsilon$ and
$\sum_{\ell \in L_p} \delta_\ell \approx \delta_{\ell^*} \approx 1 - \gamma$. This property, together with (1), (2), and (4),
implies that $y_p \approx x_{\ell^*}$, where $x_\ell$ is the PAB of link $\ell$. Another
way of interpreting this assumption is that the PAB of any link
$\ell \in L_p$, $\ell \neq \ell^*$, is significantly greater than $y_p$. This relationship
is expressed mathematically as

$$f_{x,y}(y_p, \{x_\ell \mid \ell \in L_p\}) = \mathbf{1}\{y_p = \min_{\ell \in L_p}(x_\ell)\},$$

where $\mathbf{1}\{\cdot\}$ is the indicator function.
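As a minimal illustration of the min-relationship above, the following sketch computes each path's PAB from the PABs of its links. The link-to-path assignment, the link PAB values, and the function name `path_pab` are assumptions chosen for illustration; they are not taken from Fig. 3.

```python
def path_pab(path_links, link_pab):
    """Return y_p = min over links l in L_p of x_l for every path p."""
    return {p: min(link_pab[l] for l in links) for p, links in path_links.items()}


# Hypothetical two-path, three-link topology in which both paths share link 1.
path_links = {"p1": [1, 2], "p2": [1, 3]}
link_pab = {1: 40.0, 2: 90.0, 3: 25.0}  # illustrative link PABs in Mbps

print(path_pab(path_links, link_pab))
```

In this toy case link 1 is the tight link of p1 and link 3 is the tight link of p2, so learning those two link values would be enough to determine both path PABs.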

3.2.3. Likelihood Model 

Each measurement $k$ is a $(p, r^k, z_k)$ triple that consists of the
outcome $z_k$, the probed path $p$ and the probed rate $r^k$. We specify
a likelihood function, $f_{y,z}$, learned from empirical training data,
that relates this outcome to the probe rate and the underlying
PAB of the probed path. This function depends on the probing
strategy and how the outcome of a measurement is determined.

Intuitively, when the probing rate $r_p$ is well below $y_p$, we expect
the probability of observing $z = 1$ to be very high and, similarly,
when $r_p$ is well over $y_p$, this probability should be very close to
zero. Although a simple step function looks like a good match,
it is too aggressive as we have observed higher levels of noise
when we probe around $y_p$. Based on these intuitive expectations
and experimental data (Fig. 5), we adopt the likelihood model

$$L(z = 1 \mid y_p, r_p) = \mathrm{logsig}(-\alpha (r_p - y_p))$$

for the measurements, where $\alpha$ is a small positive constant
learned empirically. However, to determine the value of $\alpha$ we
first need to estimate $y_p$. We jointly estimate the
values of $y_p$ and the constant $\alpha$ through a single regression
procedure in which we determine the best fit by minimizing
the mean squared error (MSE).

We note that our estimation procedure is not sensitive to the
exact choice of $\alpha$, which specifies the rate of decay of the sigmoid
function. Moreover, in experiments conducted on different
topologies, days, and times-of-day, we have observed that
the estimated $\alpha$ values occupy a small range. The values are
related to the variability of the path PABs over the measurement
interval. These observations suggest that it is possible to
execute the training procedure rarely.
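The joint regression for the sigmoid decay constant and the path PAB can be sketched as follows. This is only an illustration under stated assumptions: `logsig` is taken to be the standard logistic function, the fit is done by an exhaustive grid search (the regression method actually used is not specified in the text), and the "observations" are synthetic, noise-free values generated from the model itself.

```python
import math


def logsig(x):
    """Logistic sigmoid, 1 / (1 + exp(-x)) (assumed meaning of logsig)."""
    return 1.0 / (1.0 + math.exp(-x))


def likelihood_z1(rate, pab, alpha):
    """Model probability of observing z = 1 when probing at `rate`."""
    return logsig(-alpha * (rate - pab))


def fit_alpha_and_pab(rates, freq_z1, alpha_grid, pab_grid):
    """Jointly pick (alpha, y_p) minimizing the MSE against observed frequencies."""
    best = None
    for alpha in alpha_grid:
        for pab in pab_grid:
            mse = sum(
                (likelihood_z1(r, pab, alpha) - f) ** 2
                for r, f in zip(rates, freq_z1)
            ) / len(rates)
            if best is None or mse < best[0]:
                best = (mse, alpha, pab)
    return best[1], best[2]


# Synthetic data: empirical frequency of z = 1 at each probing rate (Mbps).
rates = list(range(1, 101))
freq_z1 = [likelihood_z1(r, 50.0, 0.28) for r in rates]

alpha_grid = [i / 100 for i in range(1, 101)]   # hypothetical search range
pab_grid = [float(y) for y in range(1, 101)]    # 1 Mbps resolution
alpha_hat, pab_hat = fit_alpha_and_pab(rates, freq_z1, alpha_grid, pab_grid)
print(alpha_hat, pab_hat)
```

With real measurements the frequencies would be noisy, and a continuous optimizer would likely be preferable to a grid.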

Example: We construct a likelihood model for the network
we used for our experiments using $\epsilon = 5$ Mbps and a range
of values where $B_{\min} = 1$ Mbps and $B_{\max} = 100$ Mbps. We
first gather data from five different paths: 500 measurements
from non-consecutive packet trains at each rate between $B_{\min}$
and $B_{\max}$. We then repeat this experiment five times at different
periods of the day, resulting in 25 sets of 500 measurements. We
normalize each of the 25 experiments and combine all the data
in a single plot as a function of $r_p - y_p$. The result is shown
in Fig. 5, where each data point is the result of averaging all
values that had the same value of $r_p - y_p$, i.e., all experiments for
which the distance between $r_p$ and $y_p$ is identical. The function
depicted is for $\gamma = 0.5$, but it can be easily modified for any
other value of $\gamma$: it consists of aligning the desired value of $\gamma$
on the curve with the point on the x-axis where $r_p - y_p = 0$.

As depicted in Fig. 4, after each measurement we add a function
node $f_{y,z_k}$ to the factor graph and connect it with an edge
to the variable node $y_p$ of the path that was probed. There are
two possible likelihood forms, depending on the outcome of the
measurement; they are displayed in Fig. 6.

If $z_k = 1$ then the probing rate is smaller than the PAB:

$$f_{y,z_k} = \mathrm{logsig}(-\alpha (r^k - y_p)).$$

On the other hand, if $z_k = 0$ then $r^k > y_p$ and

$$f_{y,z_k} = 1 - \mathrm{logsig}(-\alpha (r^k - y_p)).$$




Figure 5: Empirical data and regression fit for the likelihood model. $\Pr(z = 1)$
is a function of the difference between the probing rate and estimated available
bandwidth. Each data point is obtained by averaging the result of 10 packet
trains with $\epsilon = 5$ over five different paths. The best fit is obtained by performing
a regression for parameters $\alpha$ and $y_p$.



The product of all the $f_{y,z}$ factor nodes for a path represents
the cumulative knowledge obtained from measurements on this
path. In Fig. 7, we show the product of two likelihood functions
resulting from two measurements made on path $p$, one at
$r^1 = 40$ and one at $r^2 = 60$.

Figure 6: Two possible values for $f_{y,z}$, representing the knowledge about the
PAB of path $p$ obtained from a measurement at $r^k = 60$ Mbps.

Figure 7: Knowledge about a path $p$'s PAB from two measurements: 1) $r^1 = 40$,
$z_1 = 1$; 2) $r^2 = 60$, $z_2 = 0$.



Footnote: The sigmoid function rapidly decays to zero when the probing rate is
greater than the available bandwidth, even for the best possible parameter fit.
We wish to be careful and prefer a slightly less aggressive approach where
we assign some likelihood to unexpected measurement outcomes at all ingress
rates. For that reason, we introduce a small constant $k$ and bound our likelihood
function to lie in the range $[k, 1 - k]$; in our experiments $k = 0.02$.
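To make the product of likelihood factors concrete, the sketch below combines the two measurements of Fig. 7 (a probe at 40 Mbps with outcome 1 and a probe at 60 Mbps with outcome 0) on a discretized PAB axis, with the likelihood bounded to $[k, 1 - k]$. The 1 Mbps grid, the uniform prior over 1 to 100 Mbps, the value 0.28 for the decay constant (the value quoted for the simulations), and all function names are assumptions made for illustration.

```python
import math

ALPHA = 0.28  # decay constant; value quoted for the empirical fit
K = 0.02      # likelihood bound used in the experiments


def factor(rate, outcome, pab, alpha=ALPHA, k=K):
    """Bounded likelihood of `outcome` (1 or 0) for a probe at `rate` given `pab`."""
    p_z1 = 1.0 / (1.0 + math.exp(alpha * (rate - pab)))  # logsig(-alpha (r - y))
    p_z1 = min(max(p_z1, k), 1.0 - k)                    # keep within [k, 1 - k]
    return p_z1 if outcome == 1 else 1.0 - p_z1


def path_posterior(measurements, grid):
    """Normalized product of all factors for one path under a uniform prior."""
    weights = []
    for pab in grid:
        w = 1.0
        for rate, outcome in measurements:
            w *= factor(rate, outcome, pab)
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


grid = list(range(1, 101))  # candidate PAB values in Mbps
posterior = path_posterior([(40.0, 1), (60.0, 0)], grid)
map_estimate = grid[posterior.index(max(posterior))]
print(map_estimate)
```

Because of the bound, no candidate PAB value ever receives zero weight, so a single surprising outcome cannot rule out a region of the grid outright.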



3.3. Active Sampling

The estimation of available bandwidth based on self-induced
congestion is an iterative process. At every iteration, the probing
rate is chosen according to some rules. In the case of
network-wide estimation, we must also determine which path
to probe. The possible sampling rules used to make these selections
can be divided into two groups: adaptive (active) or non-adaptive
(passive). Non-adaptive sampling means that the sequence
of measurements is pre-determined; the probing rate at
step $k$ is not affected by previous measurements. These strategies
are simple and easy to implement, but can be inefficient.
Adaptive (active) selection algorithms, which use information
extracted from previous measurements to make decisions about
the future, can provide important reductions in the number of
probes.

3.3.1. Path Selection 

We now describe two greedy active learning procedures to
select the path to probe at each iteration (line 3 in Fig. 2). Both
algorithms are probabilistic in nature: they determine the probability
that each path is chosen, and then the choice is made
by a random selection according to the specified
probabilities.

The first algorithm is called weighted entropy (WE). For each
path, we can calculate the entropy of the marginal posterior distribution
of its PAB. The entropy is an indication of the uncertainty
associated with the current estimate, so WE assigns each path a
selection probability that is proportional to the entropy
of its distribution. The second algorithm, called weighted confidence
interval (WCI), assigns a selection probability to each
path that is proportional to the size of the current confidence interval
$\beta_p$ of the path's PAB; it then chooses a path at random according
to the assigned probabilities. In both algorithms, paths
are more likely to be probed if there is more uncertainty about
their PABs, and the probability of probing a path that already
satisfies our stopping criteria ($\beta_p < \beta$) is zero.

3.3.2. Rate Selection 

To decide on the probing rate (line 4 in Fig. 2), previous estimation
tools either use deterministic binary search or simply
increase the probing rate (linearly or exponentially) until it is
greater than the available bandwidth. Our Bayesian framework
allows us to adopt a more efficient and informative approach.
We choose the rate that bisects the marginal posterior distribution
of the path. By probing at the median, there is equal
probability (according to our current knowledge) that the binary
outcome will be $z_k = 1$ or $z_k = 0$. We therefore maximize the
expected information gain from our measurement; it is equivalent
to conducting a probabilistic binary search for the available
bandwidth on path $p$ [31]. By using a probabilistic rather than
deterministic approach in rate selection, hard decisions (which
could be incorrect) are not enforced.
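A single step of this probabilistic bisection can be sketched as follows, assuming the marginal posterior is stored as a probability mass function over a 1 Mbps grid and updated with the sigmoid likelihood. The grid, the decay constant of 0.28, and the function names are illustrative assumptions; for brevity the update ignores the rest of the factor graph.

```python
import math


def median_rate(grid, pmf):
    """Smallest grid rate at which the cumulative posterior reaches 0.5."""
    cumulative = 0.0
    for rate, p in zip(grid, pmf):
        cumulative += p
        if cumulative >= 0.5:
            return rate
    return grid[-1]


def update(grid, pmf, rate, outcome, alpha=0.28):
    """Bayes update of a single path's posterior after one probe."""
    weights = []
    for pab, p in zip(grid, pmf):
        p_z1 = 1.0 / (1.0 + math.exp(alpha * (rate - pab)))
        weights.append(p * (p_z1 if outcome == 1 else 1.0 - p_z1))
    total = sum(weights)
    return [w / total for w in weights]


grid = list(range(1, 101))
prior = [1.0 / len(grid)] * len(grid)      # uniform prior on [1, 100] Mbps
first_rate = median_rate(grid, prior)      # probe near the middle of the range
posterior = update(grid, prior, first_rate, outcome=1)
next_rate = median_rate(grid, posterior)   # moves up after a successful probe
print(first_rate, next_rate)
```

Unlike a deterministic binary search, a successful probe only shifts probability mass upward; rates below the probed value keep a small weight and can be revisited if later outcomes contradict the first one.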

4. Results and Discussion 

4.1. Path Selection Simulations 

The purpose of the simulations described in this subsection
is to assess the efficacy of our proposed active sampling strategies.
These are not network simulations, so they do not test
modelling assumptions at all (that is the purpose of the simulations
in Sec. 4.2 and the online experiments in Sec. 4.3).

Figure 8: Simulation results: measurements required and accuracy achieved.
Results are averaged over 500 topologies of various sizes for different confidence
levels $\eta$ and intervals $\beta$.



We use the HOT topology generated using Orbis, which
includes 939 nodes (896 end nodes) and 988 links. From this
set of links and nodes, we construct a distance matrix between 
all the nodes using shortest path routing and identify 2232 paths 
(source-destination pairs) that consist of at least seven links. 
For our simulations, we wish to test our algorithm on topologies 
of different sizes and vary the number of paths over the range 
M = 50, 100, 150, 200, 250. For each value of M, we randomly 
select ten different subsets of M paths from the entire set of 
2232 paths. For each of these 50 topologies, we assign link 
PABs using a uniform distribution between [1, 100] and repeat 
this process ten times to generate a total of 500 topologies. 

At each iteration, probe outcomes are generated according to
the likelihood model we constructed empirically in Sec. 3.2.3
($\alpha = 0.28$) for $\epsilon = 5$. For all simulations, $\gamma = 0.5$, which
means that the value of the likelihood function at $y_p = r_p$ is
0.5. We compare three path selection algorithms (Round Robin
(RR), WE and WCI) and also show the average number of measurements
required and the accuracy achieved when our active learning
algorithm is run independently and sequentially on each path (SEQ).
We use different values of $\beta$ and $\eta$ as stopping criteria; the algorithm
stops when the size of the confidence interval $\beta_p$ is
smaller than $\beta$ for all paths $p$. If these conditions are not met,
the algorithm stops after 10000 iterations.

Fig. 8 shows the number of measurements per path required
for the algorithm to terminate, as well as the accuracy (an estimate
is considered accurate if the real PAB lies within the confidence
limits: $\beta_{\min} < y_p < \beta_{\max}$). In most cases, SEQ requires
fewer measurements than the round-robin strategy with
the graphical model. This is due to the fact that not all paths
require the same number of measurements. In the RR case, the
algorithm iterates through all paths, including those that have
already met the required confidence criteria, which is not the
case in SEQ. Both data-driven approaches, WCI and WE, significantly
reduce the number of measurements required while
achieving satisfactory accuracy (i.e., the accuracy exceeds the
requested confidence level $\eta$).

Footnote: Orbis: http://www.sysnet.ucsd.edu/~pmahadevan/topo_research/topo.html

Figure 9: Simulated average number of measurements as a function of the number
of tight links in the topology. Both values are normalized by the number of
paths $M$. We show all the simulated values and a first-degree polynomial fit for
each technique.

We investigate the number of iterations for the case where
$\eta = 0.95$, $\beta = 10$. In Fig. 9, we show the average number of measurements
per path as a function of the number of tight links
per path in the network. Due to the nature of our model, we
can identify the PAB of each path if we know the PAB of all the
tight links in the network. Therefore, we expect to make greater
savings in terms of number of probes when the total number of
tight links is small relative to the total number of paths (or, in
other words, when the number of paths that share a single tight
link is high). The average number of measurements per path
required by WCI is between 46% and 73% lower than the number
required by RR and between 39% and 55% lower than SEQ. WE and WCI
provide important savings in terms of time and measurements
without affecting the accuracy, but since WCI is slightly better
in terms of average number of measurements, we use WCI for
our online experiments. As expected, when tight links are located
on non-shared links, more measurements are required to
achieve the same level of accuracy.
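The normalized quantity on the horizontal axis of Fig. 9, the number of distinct tight links divided by the number of paths, can be computed as in the following sketch. The topology and link PAB values are hypothetical, and the function names are introduced only for illustration.

```python
def tight_link(links, link_pab):
    """Link with the smallest PAB on a path (ties go to the first such link)."""
    return min(links, key=lambda l: link_pab[l])


def tight_links_per_path(path_links, link_pab):
    """Number of distinct tight links divided by the number of paths M."""
    distinct = {tight_link(links, link_pab) for links in path_links.values()}
    return len(distinct) / len(path_links)


# Hypothetical topology: three paths share tight link 1, one path has its own.
path_links = {"p1": [1, 2], "p2": [1, 3], "p3": [4, 1], "p4": [5, 6]}
link_pab = {1: 10.0, 2: 50.0, 3: 60.0, 4: 70.0, 5: 80.0, 6: 20.0}
ratio = tight_links_per_path(path_links, link_pab)
print(ratio)
```

A low ratio means many paths share the same tight link, which is the regime where measurements on one path are most informative about the others.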

4.2. Topology Accuracy Simulations 

Our methodology assumes that the logical topology is known 
and stable during the estimation procedure. We are interested in 
assessing the robustness of our approach relative to i) errors in- 
troduced in the physical topology extraction using traceroute 
and ii) changes in the real topology in the middle of the estima- 
tion. 

Let TE be the probability that path $p$ is incorrectly extracted
using traceroute. For each erroneously extracted path, there
is a probability $q_{flip}$ that each link in the set $\mathcal{L}$ is mistakenly
identified as either present or missing from path $p$. More concretely,
for each row of P, there is a probability TE that every
column entry is flipped with probability $q_{flip}$. The result is a
noisy factor graph (path matrix) that propagates inaccurate information
because of invalid edges between path and link variable
nodes.

Figure 10: Average number of measurements per path as a function of the
traceroute error for topologies with different numbers of paths $M$.



For each of the 500 topologies we used in Sec. 4.1, we
generate seven topologies by varying TE over the range
0%, 5%, 15%, 25%, 50%, 75%, 90%. For the simulations, we
use WCI for path selection, the same likelihood model with
$\alpha = 0.28$ and set $\gamma = 0.5$, $\epsilon = 5$ Mbps, $\eta = 0.95$, $\beta = 10$ Mbps,
$B_{\min} = 1$ Mbps, $B_{\max} = 100$ Mbps. In Fig. 10, we show the
average number of measurements per path as a function of TE.
As expected, the number of iterations required to achieve the
requested confidence level and tightness increases for topologies
with a greater probability of traceroute error. However,
this increase is not significant; even with TE = 90%, the
estimation requires only 1.5 more measurements per path on
average.

Figure 11: Average estimation accuracy as a function of the topology accuracy
(Jaccard similarity coefficient: $|A \cap B| / |A \cup B|$) for all topologies.

To quantify the similarity between topologies and provide a
more meaningful metric than TE, we use the Jaccard similarity
coefficient. It is equal to the size of the intersection (number
of correctly identified links) divided by the size of the union
(all links from both topologies) [37]. We display the average
accuracy of our estimates over all topologies in Fig. 11. Our
simulation results show that, for topologies of any size, as long
as the traceroute methodology produces a matrix P with a
similarity coefficient greater than 0.5, 85% of the paths are estimated
accurately on average. Therefore, even when it uses an
inaccurate path matrix, our methodology can generate reasonably
precise estimates without any significant inflation in the
number of probes required.

Footnote: This probability is chosen such that the average path length remains constant.
Based on the topologies we used for our simulations, this probability
depends on the number of links in the network and varies between 1% and 3%.
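The path-matrix corruption and the similarity metric can be sketched as follows. Treating each nonzero (path, link) entry of the 0/1 path matrix as a set element when computing the Jaccard coefficient is an assumption about how the two topologies are compared; the function names and the tiny matrix are illustrative.

```python
import random


def corrupt_path_matrix(P, te, q_flip, rng):
    """With probability te per row, flip each entry of that row with prob. q_flip."""
    noisy = []
    for row in P:
        if rng.random() < te:  # this path is extracted erroneously
            row = [1 - v if rng.random() < q_flip else v for v in row]
        noisy.append(list(row))
    return noisy


def jaccard(P_true, P_noisy):
    """|A intersect B| / |A union B| over the nonzero (path, link) entries."""
    a = {(i, j) for i, row in enumerate(P_true) for j, v in enumerate(row) if v}
    b = {(i, j) for i, row in enumerate(P_noisy) for j, v in enumerate(row) if v}
    union = a | b
    return len(a & b) / len(union) if union else 1.0


P = [[1, 1, 0], [1, 0, 1]]  # 2 paths x 3 links, illustrative only
rng = random.Random(0)
same = corrupt_path_matrix(P, te=0.0, q_flip=0.02, rng=rng)
flipped = corrupt_path_matrix(P, te=1.0, q_flip=1.0, rng=rng)
print(jaccard(P, same), jaccard(P, flipped))
```

The two extreme settings above bracket the metric: an untouched matrix gives a coefficient of 1, and a matrix with every entry flipped shares no links with the original and gives 0.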

As far as topology stability is concerned, we have not per- 
formed any simulations, but we have studied empirically the 
validity of our assumption. Before each of our online experi- 
ments, we generated the matrix P and studied its similarity with 
previous matrices for the same set of nodes. We conclude that 
the PlanetLab network is stable enough to assume that topologies
remain constant during the estimation procedure (Song and
Yalagandula [18] made similar observations). Although it is
probably safe to assume that the topologies are constant for
an even longer period of time (at least 24 hours from our observations),
we continue to generate a new matrix P before every
experiment since it is neither time nor resource consuming. It
is important to note that the logical topology is not always affected
by variations in the physical topology. Therefore, such variations
do not necessarily imply modifications in the path matrix and
the associated factor graph.

4.3. Online experiments 

For our online experiments, we have deployed our measurement
software, coded in C, on various nodes of the PlanetLab
network. We use a topology with six nodes, $M = 30$ paths
and $N = 65$ logical links. For all our experiments, the likelihood
model is the one presented in Sec. 3.2.3 (with $\epsilon = 5$ and
$\alpha = 0.28$) and WCI is used to select the path to probe at each iteration.
Also, we choose $B_{\min} = 1$ Mbps and $B_{\max} = 100$ Mbps
as conservative estimates of the PAB of each link (we assume
that the links with the highest capacity are 100 Mbps links).

Each run includes an estimation of all the paths followed by a
testing procedure. The estimation terminates when the stopping
criteria, $\beta = 10$ Mbps and $\eta = 0.95$, are met for all paths. We
validate our results by sending trains of 2400 packets of 1000
bytes (the equivalent of 60 seconds of video encoded at 320
kbps) and observing the output rate. For each run, we perform
a total of 16 tests: four tests on each of four disjoint paths. In each of
the four tests, the sending rate of the train is different: the lower
bound of the confidence interval $\beta_{\min}$, the lower bound plus
$\epsilon = 5$ Mbps, the upper bound of the confidence interval $\beta_{\max}$, and
$\epsilon = 5$ Mbps above the upper bound. For each test, we compute
the empirical probability that the output rate is within $\epsilon = 5$
Mbps of the input rate ($z = 1$).
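The arithmetic behind the validation train, and the empirical probability computed for each test, can be checked with the short sketch below. The function names and the sample output rates are hypothetical; only the train size, packet size, duration and the tolerance of 5 Mbps come from the text.

```python
def stream_rate_kbps(n_packets, packet_bytes, seconds):
    """Average bit rate, in kbps, of n_packets of packet_bytes sent over seconds."""
    return n_packets * packet_bytes * 8 / seconds / 1000.0


def empirical_probability(input_rate, output_rates, eps=5.0):
    """Fraction of tests whose output rate is within eps Mbps of the input rate."""
    hits = sum(1 for out in output_rates if out > input_rate - eps)
    return hits / len(output_rates)


# 2400 packets of 1000 bytes correspond to 60 s of video at 320 kbps.
print(stream_rate_kbps(2400, 1000, 60))

# Illustrative output rates (Mbps) observed for four trains sent at 50 Mbps.
print(empirical_probability(50.0, [48.0, 46.0, 44.0, 30.0]))
```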

In this first experiment, we set $\gamma = 0.5$ and wish to verify whether
the confidence intervals produced include the value of the PAB.
To do so, we compute the average over 20 runs of the empirical
probability $\Pr(r'_p > r_p - \epsilon)$ for each one of the four tests. For the
probes, we use $N_s = 3$ trains per measurement, a packet size of
$P_{size} = 1000$ bytes and vary the number of packets in each train
in the range $L_s = [25, 50, 100, 150, 200, 250]$.

Figure 12: Empirical probability that the output rate is within $\epsilon$ Mbps of the
input rate. Each point represents the average of 80 test results (20 runs).

In Fig. 12, we show the empirical probability (averaged over
80 tests) that the output rate is within $\epsilon = 5$ Mbps of the input
rate for four different probing rates relative to the confidence
interval bounds. The first observation is that the number of
packets used in trains induces very little variation in empirical
probability for all the probing rates. This suggests that, for this
network at least, 25 packets per train would suffice. For all the
train sizes we tested, the desired probability $\gamma = 0.5$ is included
in the probability interval of $\beta_{\min}$ and $\beta_{\max}$. This result confirms
that our method is able to produce intervals that include
the value of the PAB accurately. The fact that $\gamma = 0.5$ is very
close to the upper bound suggests that we might underestimate
the PAB. We discuss possible reasons for this below.

Figure 13: Empirical probability of observing $z = 1$ averaged over 17966 measurements
as a function of the difference between the probing rate and the estimated
PAB (MAP of the marginal posterior).



Footnote: Although the PlanetLab (http://www.planet-lab.org/) network was
once believed to be too heavily loaded, Spring et al. [38] explained that PlanetLab
has evolved and this is no longer true.

Footnote: planetlab3.csail.mit.edu, planetlab-1.cs.unibas.ch, planlab1.cs.caltech.edu,
planetlab2.acis.ufl.edu, planetlab1.cs.stevens-tech.edu, planetlab2.csg.uzh.ch.



We investigate the impact of the train size by using the raw
data collected at each node during the 20 runs (18000 measurements
for each value of $L_s$). In Fig. 13, we show the average
empirical probability of observing $z = 1$ as a function of the difference
between the probing rate and our estimate of the PAB
(here we use the marginal maximum a posteriori (MAP) estimate).
Since we set $\gamma = 0.5$, we expect the probability of
observing $z = 1$ to be near 0.5 when the probing rate is equal
to the PAB ($r_p - y_p = 0$). However, what we observe is that
the probability is closer to 0.75 at that point, which is approximately
the average empirical probability at $\beta_{\min} + \epsilon$ in Fig. 12.
This confirms a slight underestimation of the PAB, which is
probably due to an inaccurate likelihood model. The figure also
shows that as the train size is reduced, the measurements become
noisier and the bias (underestimation) becomes more
significant.

Figure 14: Number of measurements (top) and bytes (bottom) used per
path (averaged over 20 runs for each train size $L_s$) during the estimation procedure.

Figs. 12 and 13 indicate that the accuracy obtained when using
$L_s = 25$ and $L_s = 250$ packets is similar. In Fig. 14, we
show the average number of measurements and bytes per path
required to complete the estimation procedure as a function of
the train size. Since the number of measurements is constant
for all values of $L_s$, we observe a linear growth in the number
of bytes required to achieve the desired accuracy. From these
results, using 25 packets per train is the best choice among the
train sizes we tested, as it provides similar accuracy to larger train
sizes with significant savings in terms of probing overhead.

In the previous experiment, where $\gamma = 0.5$, the probability of
observing $z = 1$ when the input rate is equal to the lower bound
of our confidence interval is 0.875. That probability drops to
0.7 for the rate at the middle of our confidence interval ($\beta_{\min} + \epsilon$).
We perform another experiment of 20 runs with $L_s = 25$
and $\gamma = 0.9$. By increasing $\gamma$, we obtain higher guarantees for
rates at the lower bound (0.97) and in the middle (0.86) of the
confidence interval. However, increasing the value of $\gamma$ results
in a larger number of measurements. For $\gamma = 0.9$, the average
number of measurements per path was 33 ± 1 (compared to 12 ± 1
for $\gamma = 0.5$), an increase of 175%.

In Fig. 15, we display the confidence intervals as well as the
test results (probe rate and output rate) for one of the runs performed
with $L_s = 25$ and $\gamma = 0.5$. The outcome of this particular
run demonstrates the clear heterogeneity of the PlanetLab
network; over 25% of the paths have small (less than 20 Mbps)
PAB whereas the other 75% have PAB greater than 80 Mbps.
The tight links on the paths with lower PAB could either be
heavily utilized 100 Mbps links or, more likely, 10 Mbps links
with small amounts of cross-traffic. These findings about the
PlanetLab network correspond to those of Lee et al. i3S

Figure 15: Bounds of the confidence intervals for a 30-path topology in a
sample run performed for $L_s = 25$ and $\gamma = 0.5$.



Table 1: Average time and bytes used by Pathload and our approach for the $M = 30$
path topology over 5 runs.

                 seconds / path    kbytes / path
Pathload         27.0 ± 0.8        10806 ± 1058
Our approach     7.1 ± 0.3         612 ± 18



It is interesting to compare our estimation methodology to
another tool based on the classical definition of available bandwidth
to examine the extent of correlation between the two metrics.
We choose to compare our results with those obtained using
Pathload (version 1.3.2) [9] because it is known to be very
accurate. Using the same topology described above ($M = 30$,
$N = 65$), we run both Pathload (sequentially on every single path)
and our algorithm (WCI, $L_s = 25$, $\gamma = 0.5$). Since the metrics
are different, a complete correspondence between the estimates
is not expected. Nonetheless, both estimation techniques
strive to examine the same path property (at what rate can probe
trains be sent without inducing congestion). The confidence intervals
obtained from both tools overlap for 53% of the paths,
or 76% if we tolerate a 2 Mbps error. This highlights the fact
that there is a certain level of correspondence between the two
metrics. Table 1 compares the number of bytes transmitted and
time elapsed. We can see that our approach provides significant
gains in terms of measurement latency (75% savings) and
overhead (95% savings).
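The savings can be recomputed directly from the mean values in Table 1, as in this short check; the function names are introduced here for illustration. The results come out at roughly 74% and 94%, in line with the rounded figures quoted above, and the same kind of calculation reproduces the 175% increase in measurements reported when $\gamma$ is raised from 0.5 to 0.9.

```python
def savings(baseline, ours):
    """Fraction of the baseline cost that is saved."""
    return 1.0 - ours / baseline


def relative_increase(before, after):
    """Relative growth from `before` to `after`."""
    return (after - before) / before


latency_savings = savings(27.0, 7.1)         # seconds per path (Table 1)
overhead_savings = savings(10806.0, 612.0)   # kbytes per path (Table 1)
gamma_cost = relative_increase(12.0, 33.0)   # measurements per path

print(round(latency_savings * 100), round(overhead_savings * 100), gamma_cost)
```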

Comparing the overhead of our technique with Pathload's
confirms that previous tools are not well suited to multi-path
estimation. The only other approaches that can produce efficient
network-wide AB estimates are BRoute [20] and bandwidth
landmarking [22]. In both cases, there are very few details
on the actual overhead incurred by their techniques. Hu
and Steenkiste [20] claim that 80% of the available bandwidth
estimates obtained from BRoute are accurate within 50% when
using a subset that includes only 7% of all paths. However,
there is no mention of how many measurements are required
for each path.

5. Conclusion 

In this paper, we presented a novel technique based on a 
probabilistic framework to estimate network-wide probabilistic 
available bandwidth. We introduced PAB, a new metric with 
adjustable parameters that addresses issues related to the dy- 
namics and variability of available bandwidth. Our method- 
ology based on factor graphs and active sampling is the first 
to combine both techniques in the context of available band- 
width estimation. To further reduce the overhead of our tech- 
nique, we are currently working on a new measurement strategy 
and likelihood model based on chirps rather than trains of pack- 
ets, which, from our preliminary results, can achieve significant 
savings in terms of probing overhead.